The Quality of the Covariance Selection Through Detection Problem and AUC Bounds
Abstract
We consider the problem of quantifying the quality of model selection for a graphical model, and we do so by formulating it as a detection problem. Model selection problems usually minimize a distance between the original distribution and the model distribution. For the special case of Gaussian distributions, model selection reduces to the covariance selection problem, discussed by Dempster [2] and widely studied in the literature, in which the likelihood criterion is maximized or, equivalently, the Kullback-Leibler (KL) divergence is minimized to compute the model covariance matrix. While this solution is optimal for Gaussian distributions in the sense of the KL divergence, it is not optimal with respect to other information divergences and criteria such as the Area Under the Curve (AUC). In this paper, we analytically compute upper and lower bounds for the AUC and assess the quality of model selection using the AUC and its bounds as an accuracy measure in the detection problem. We define the correlation approximation matrix (CAM) and show that the analytical computation of the KL divergence, the AUC, and its bounds depends only on the eigenvalues of the CAM. We also show the relationship between the AUC, the KL divergence, and the ROC curve by optimizing with respect to the ROC curve. In the examples provided, we pick tree structures as the simplest graphical models. We perform simulations on fully connected graphs and compute the tree-structured models by applying the widely used Chow-Liu algorithm [3]. The examples show that, when the number of nodes in the graphical model is large, the quality of tree approximation models is generally poor as measured by information divergences, the AUC, and its bounds. Moreover, we show both analytically and by simulation that 1 − AUC for the tree approximation model decays exponentially as the dimension of the graphical model increases.
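The pipeline the abstract describes can be illustrated with a small numerical sketch. The code below is not the authors' implementation; it rests on several assumptions that the abstract does not confirm. The CAM is taken to be the product of the original covariance and the model precision matrix (the abstract names the CAM but does not define it). The detection statistic is taken to be the log-likelihood ratio between the original Gaussian and its tree model. The AUC is estimated by Monte Carlo rather than computed analytically or bounded. All function names are hypothetical.

```python
import numpy as np

def chow_liu_edges(sigma):
    """Chow-Liu tree for a Gaussian: maximum spanning tree (Prim's method)
    on pairwise mutual information, I_ij = -0.5 * log(1 - rho_ij^2)."""
    n = sigma.shape[0]
    d = np.sqrt(np.diag(sigma))
    rho2 = np.clip((sigma / np.outer(d, d)) ** 2, 0.0, 1.0 - 1e-12)
    mi = -0.5 * np.log1p(-rho2)
    in_tree, edges = [0], []
    while len(in_tree) < n:
        # Pick the heaviest edge leaving the current tree.
        _, i, j = max((mi[i, j], i, j) for i in in_tree
                      for j in range(n) if j not in in_tree)
        edges.append((i, j))
        in_tree.append(j)
    return edges

def tree_precision(sigma, edges):
    """Precision matrix of the tree model that matches sigma on the
    diagonal and on the tree edges (covariance selection on a tree)."""
    n = sigma.shape[0]
    k = np.zeros((n, n))
    deg = np.zeros(n, dtype=int)
    for i, j in edges:
        idx = np.ix_([i, j], [i, j])
        k[idx] += np.linalg.inv(sigma[idx])   # pairwise marginal term
        deg[i] += 1
        deg[j] += 1
    # Subtract single-node marginal terms counted (degree - 1) extra times.
    k[np.diag_indices(n)] -= (deg - 1) / np.diag(sigma)
    return k

def cam_eigenvalues(sigma, k_model):
    """Eigenvalues of sigma @ k_model (assumed CAM), computed through the
    similar symmetric matrix L^T K L with sigma = L L^T."""
    low = np.linalg.cholesky(sigma)
    return np.linalg.eigvalsh(low.T @ k_model @ low)

def kl_from_cam(sigma, k_model):
    """KL(N(0, sigma) || N(0, inv(k_model))) in nats from CAM eigenvalues."""
    lam = cam_eigenvalues(sigma, k_model)
    return 0.5 * np.sum(lam - 1.0 - np.log(lam))

def auc_monte_carlo(sigma, k_model, m=2000, seed=0):
    """Monte Carlo AUC of the log-likelihood-ratio detector that separates
    samples of the original Gaussian from samples of the tree model."""
    rng = np.random.default_rng(seed)
    n = sigma.shape[0]
    sigma_model = np.linalg.inv(k_model)
    l_true = np.linalg.cholesky(sigma)
    l_model = np.linalg.cholesky((sigma_model + sigma_model.T) / 2)
    x_true = rng.standard_normal((m, n)) @ l_true.T
    x_model = rng.standard_normal((m, n)) @ l_model.T
    # log p_true(x) - log p_model(x), up to an additive constant.
    diff = k_model - np.linalg.inv(sigma)
    s_true = 0.5 * np.einsum("ij,jk,ik->i", x_true, diff, x_true)
    s_model = 0.5 * np.einsum("ij,jk,ik->i", x_model, diff, x_model)
    wins = (s_true[:, None] > s_model[None, :]).mean()
    ties = (s_true[:, None] == s_model[None, :]).mean()
    return wins + 0.5 * ties

# Illustrative fully connected example: equicorrelated covariance.
n, r = 6, 0.6
sigma_eq = (1 - r) * np.eye(n) + r * np.ones((n, n))
edges_eq = chow_liu_edges(sigma_eq)
k_eq = tree_precision(sigma_eq, edges_eq)
kl_eq = kl_from_cam(sigma_eq, k_eq)
auc_eq = auc_monte_carlo(sigma_eq, k_eq)
print(f"tree edges: {edges_eq}")
print(f"KL = {kl_eq:.4f} nats, Monte Carlo AUC = {auc_eq:.3f}")
```

When the original covariance is already tree-structured (for example a first-order Markov chain), the tree model reproduces it exactly, every CAM eigenvalue equals 1, and the KL divergence is zero. An AUC near 1 means the tree model is easy to tell apart from the original distribution, which is the sense in which the abstract calls a tree approximation poor.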
Similar resources
Determining the Optimal Value Bounds of the Objective Function in Interval Quadratic Programming Problem with Unrestricted Variables in Sign
In most real-world applications, the parameters of the problem are not well understood. This causes the problem data to be uncertain and to be expressed as intervals. Interval mathematical models include interval linear programming and interval nonlinear programming problems. A model of interval nonlinear programming problems for decision making based on uncertainty is interval quadratic prog...
A hybrid CS-SA intelligent approach to solve uncertain dynamic facility layout problems considering dependency of demands
This paper aims at proposing a quadratic assignment-based mathematical model to deal with the stochastic dynamic facility layout problem. In this problem, product demands are assumed to be dependent normally distributed random variables with known probability density function and covariance that change from period to period at random. To solve the proposed model, a novel hybrid intelligent algo...
Extracting Predictor Variables to Construct Breast Cancer Survivability Model with Class Imbalance Problem
Application of data mining methods as a decision support system has a great benefit to predict survival of new patients. It also has a great potential for health researchers to investigate the relationship between risk factors and cancer survival. But due to the imbalanced nature of datasets associated with breast cancer survival, the accuracy of survival prognosis models is a challenging issue...
Herbal plants zoning using target detection algorithms on time-series of Sentinel-2 multispectral images (Amygdalus Scoparia)
Today, medicinal plants have a special place in the economy and health of a society. Because many of these plants grow naturally, zoning them is necessary for optimal utilization. Traditional zoning solutions are inefficient because of their low accuracy and speed, so a new approach is needed. Remote sensing data have many applications in various fie...
Proposed new signal for real-time stress monitoring: Combination of physiological measures
Human stress is a physiological tension that appears when a person responds to mental, emotional, or physical challenges. Detecting human stress and developing methods to manage it have become important issues. Automatic stress detection through physiological signals may be a useful method for solving this problem. In most of the earlier studies, long-term time window was considere...
Journal title:
- CoRR
Volume abs/1605.05776, Issue
Pages -
Publication date 2016